316 research outputs found

    Bayesian Inference for Retrospective Population Genetics Models Using Markov Chain Monte Carlo Methods

    Genetics, the science of heredity and variation in living organisms, has a central role in medicine, in breeding crops and livestock, and in studying fundamental topics of biological sciences such as evolution and cell functioning. The field is currently developing rapidly because of recent advances in the technologies by which molecular data can be obtained from living organisms. To extract as much information as possible from such data, the analyses need to be carried out using statistical models that are tailored to take account of the particular genetic processes. In this thesis we formulate and analyze Bayesian models for genetic marker data of contemporary individuals. The major focus is on modeling the unobserved recent ancestry of the sampled individuals (say, for tens of generations or so), which is carried out by using explicit probabilistic reconstructions of the pedigree structures accompanied by the gene flows at the marker loci. For such a recent history, the recombination process is the major genetic force that shapes the genomes of the individuals, and it is included in the model by assuming that the recombination fractions between adjacent markers are known. The posterior distribution of the unobserved history of the individuals is studied conditionally on the observed marker data by using a Markov chain Monte Carlo (MCMC) algorithm. The example analyses consider estimation of the population structure, the relatedness structure (both at the level of whole genomes and at each marker separately), and haplotype configurations. For situations where the pedigree structure is partially known, an algorithm to create an initial state for the MCMC algorithm is given. Furthermore, the thesis includes an extension of the model for the recent genetic history to situations where a quantitative phenotype has also been measured from the contemporary individuals. 
In that case the goal is to identify positions on the genome that affect the observed phenotypic values. This task is carried out within the Bayesian framework, where the number and the relative effects of the quantitative trait loci are treated as random variables whose posterior distribution is studied conditionally on the observed genetic and phenotypic data. In addition, the thesis contains an extension of a widely used haplotyping method, the PHASE algorithm, to settings where genetic material from several individuals has been pooled together, and the allele frequencies of each pool are determined in a single genotyping.
    Genetics, the science of heredity, studies the structure, function, and variation of hereditary material, as well as other factors that affect variation between individuals in living organisms. Current laboratory methods make it possible to collect increasingly detailed and extensive molecular-level data from organisms. Analyzing such data requires statistical models that exploit, as precisely as possible, the available knowledge of the biological processes that generated the data. This thesis develops Bayesian statistical models for certain genetic processes and applies them to example data sets. The main focus is on modeling the shared recent history of individuals. In the simplest case, the starting point is a set of present-day individuals whose genetic material is assumed to be known at certain marker loci on the basis of laboratory genotype measurements. The statistical model is used to estimate probabilities for the different recent histories connecting the individuals, which are described by pedigree structures and the inheritance paths of the marker genes. The time periods considered span at most tens of generations. The thesis also uses the recent-history model in a gene mapping application whose aim is to locate positions in the genome that affect a particular trait measured or observed in the individuals. Other applications include estimation of population structure and of the degrees of relatedness between individuals.

    linemodels : clustering effects based on linear relationships

    Estimation of effects of multiple explanatory variables on multiple outcome measures has become routine across life sciences with high-throughput molecular technologies. The linemodels R-package allows a probabilistic clustering of variables based on their observed effect sizes on two outcomes.

    Estimating quantile treatment effect on the original scale of the outcome variable: a case study of common cold treatments

    The effects of treatments on continuous outcomes can be estimated on the mean difference scale (i.e. in measurement units) or on the relative effect scale (i.e. in percentages), both of which provide only a single effect size estimate over the study population. Quantile treatment effect (QTE) analysis is more informative, as it describes the effect of the treatment across the whole population. A drawback of QTE has been that it is usually presented over the quantiles of the control group distribution, whereas presentation over the measurement units is often more informative. We developed a method to estimate the back-transformed QTE (BQTE), which presents the QTE as a function of the outcome value in the control group, using piecewise linear interpolation and bootstrapping. We further applied the BQTE function to provide informative bounds on the treatment effect at the upper and lower tails of the population. To illustrate the approach, we used three data sets on treatments for the common cold: zinc gluconate lozenges, zinc acetate lozenges, and nasal carrageenan. In all data sets, the relative scale provided a better summary of the BQTE distribution than the mean difference. The BQTE approach is particularly useful for describing the variability of effects on the duration of illnesses, length of hospital stay and other continuous outcomes that can vary greatly in the population. Using this method, it is possible to present the QTE in measurement units, which provides an informative addition to the standard presentation by quantiles. Comment: 23 pages, 4 figures
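
    The back-transformation described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `bqte` and `bqte_bootstrap_ci`, the quantile grid, and the percentile bootstrap are assumptions made for the example. The QTE is computed at a grid of quantiles, re-indexed by the control-group outcome value at each quantile, and interpolated piecewise linearly.

```python
import numpy as np

def bqte(control, treated, x_grid, probs=None):
    """QTE expressed as a function of the control-group outcome value.

    Hypothetical helper: the quantile grid `probs` is an assumption.
    """
    if probs is None:
        probs = np.linspace(0.05, 0.95, 19)
    q_ctrl = np.quantile(control, probs)   # control outcome at each quantile
    q_trt = np.quantile(treated, probs)    # treated outcome at each quantile
    qte = q_trt - q_ctrl                   # effect at each quantile
    # Piecewise linear interpolation over control outcome values.
    # np.interp holds the end values constant outside [q_ctrl[0], q_ctrl[-1]].
    return np.interp(x_grid, q_ctrl, qte)

def bqte_bootstrap_ci(control, treated, x_grid, n_boot=1000, level=0.95, seed=0):
    """Pointwise percentile-bootstrap interval for the BQTE curve."""
    rng = np.random.default_rng(seed)
    draws = np.empty((n_boot, len(x_grid)))
    for b in range(n_boot):
        # Resample each arm independently, with replacement.
        c = rng.choice(control, size=len(control), replace=True)
        t = rng.choice(treated, size=len(treated), replace=True)
        draws[b] = bqte(c, t, x_grid)
    alpha = 1.0 - level
    lower, upper = np.quantile(draws, [alpha / 2, 1 - alpha / 2], axis=0)
    return lower, upper
```

    With a treatment that halves every outcome, for example, `bqte` returns an effect of minus half the control value at each point of `x_grid`, which is the kind of pattern the relative scale summarizes well.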

    MetaPhat : Detecting and Decomposing Multivariate Associations From Univariate Genome-Wide Association Statistics

    Background: Multivariate testing tools that integrate multiple genome-wide association studies (GWAS) have become important as the number of phenotypes gathered from study cohorts and biobanks has increased. While these tools have been shown to boost statistical power considerably over univariate tests, an important remaining challenge is to interpret which traits are driving the multivariate association and which traits are just passengers with minor contributions to the genotype-phenotype association statistic. Results: We introduce MetaPhat, a novel bioinformatics tool to conduct GWAS of multiple correlated traits using univariate GWAS results and to decompose multivariate associations into sets of central traits based on intuitive trace plots that visualize Bayesian Information Criterion (BIC) and P-value statistics of multivariate association models. We validate MetaPhat with Global Lipids Genetics Consortium GWAS results, and we apply MetaPhat to univariate GWAS results for 21 heritable and correlated polyunsaturated lipid species from 2,045 Finnish samples, detecting seven independent loci associated with a cluster of lipid species. In most cases, we are able to decompose these multivariate associations to only three to five central traits out of all 21 traits included in the analyses. We release MetaPhat as an open source tool written in Python with built-in support for multi-processing, quality control, clumping and intuitive visualizations using the R software. Conclusion: MetaPhat efficiently decomposes associations between multivariate phenotypes and genetic variants into smaller sets of central traits and improves the interpretation and specificity of genome-phenome associations. MetaPhat is freely available under the MIT license at:.

    Estimating genealogies from linked marker data: a Bayesian approach

    Background: Answers to several fundamental questions in statistical genetics would ideally require knowledge of the ancestral pedigree and of the gene flow therein. A few examples of such questions are haplotype estimation, relatedness and relationship estimation, gene mapping by combining pedigree and linkage disequilibrium information, and estimation of population structure. Results: We present a probabilistic method for genealogy reconstruction. Starting with a group of genotyped individuals from some population isolate, we explore the state space of their possible ancestral histories under our Bayesian model by using Markov chain Monte Carlo (MCMC) sampling techniques. The main contribution of our work is the development of sampling algorithms in the resulting vast state space with highly dependent variables. The main drawback is the computational complexity that limits the time horizon within which explicit reconstructions can be carried out in practice. Conclusion: The estimates for IBD (identity-by-descent) and haplotype distributions are tested in several settings using simulated data. The results appear to be promising for a further development of the method.

    The budding and depth of invasion model in oral cancer : A systematic review and meta-analysis

    Background: Tumour budding (B) and depth of invasion (D) have both been reported as promising prognostic markers in oral squamous cell carcinoma (OSCC). This meta-analysis assessed the prognostic value of the combination of tumour budding and depth of invasion (BD model) in OSCC. Methods: Databases including Ovid MEDLINE, PubMed, Scopus and Web of Science were searched for articles that studied the BD model as a prognosticator in OSCC. The PICO search question was "In OSCC patients, does BD model have a prognostic power?" We used the reporting recommendations for tumour marker prognostic studies (REMARK) criteria to evaluate the quality of studies eligible for systematic review and meta-analysis. Results: Nine studies were relevant as they analysed the BD model for prognostication of OSCC. These studies used either haematoxylin and eosin (HE) or pan-cytokeratin (PCK)-stained resected sections of OSCC. Our meta-analysis showed a significant association of the BD model with OSCC disease-free survival (hazard ratio = 2.02; 95% confidence interval = 1.44-2.85). Conclusions: The BD model is a simple and reliable prognostic indicator for OSCC. Evaluation of the BD model from HE- or PCK-stained sections could facilitate individualized treatment planning for OSCC patients.
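
    As a reading aid for the pooled estimate reported in this abstract, the sketch below recovers an approximate standard error and Wald z-statistic from the hazard ratio and its 95% confidence interval. It assumes that the interval is a symmetric Wald interval on the log scale, which the abstract does not state; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def se_from_ci(lower, upper, level=0.95):
    """Standard error of log(HR), assuming a symmetric Wald CI on the log scale."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def wald_z_and_p(hr, se_log_hr):
    """Wald z-statistic and two-sided p-value for H0: HR = 1."""
    z = math.log(hr) / se_log_hr
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Values taken from the abstract: HR = 2.02, 95% CI = 1.44-2.85.
se = se_from_ci(1.44, 2.85)
z, p = wald_z_and_p(2.02, se)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

    Under that assumption the interval excludes 1 by a wide margin, which is consistent with the abstract's description of the association as significant.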

    Validation of a tail-mounted triaxial accelerometer for measuring foals' lying and motor behavior

    Foals' locomotory and lying-down behavior can be an indicator of their health and development. However, measurement tools have not been well described, and the previously reported attachment sites on the limbs of adult horses are unsafe for longer-term data collection in foals. In this study, a tail-mounted three-dimensional accelerometer was validated for monitoring foals' lying, standing, and walking behavior. Eleven foals were recruited: four hospitalized and seven at private breeding stables. Accelerometers were attached to the dorsal aspect of the base of each foal's tail, and their behavior was video recorded. Hospitalized foals had continuous video monitoring inside their stalls, and the breeding stables' foals were monitored outside at pasture for 1-5 periods (mean 42 minutes per period), depending on how long they were at the facility. Acceleration was measured at a sampling frequency of 100 Hz, and the mean, maximum, and minimum acceleration were recorded in 5-second epochs for the x-, y-, and z-axes. Lying, standing, and walking behavior was monitored from videos of all foals, and the start and end time of each behavior was compared with the corresponding data from the accelerometer. A naive Bayes classifier was developed, using dynamic body acceleration and craniocaudal movement of the tail (tilt along the z-axis), to predict a foal's lying behavior. In validation, the classifier identified foals' lying behavior with high accuracy and precision (specificity, 0.92; sensitivity, 0.89; precision, 0.98; accuracy, 0.92). The overall accuracy for classifying walking and standing was also good, but the precision was poor (0.46 and 0.24, respectively). When standing and walking behavior were combined into a single "standing or walking" class, the precision improved (specificity, 0.62; sensitivity, 0.92; precision, 0.89; accuracy, 0.92). In conclusion, a tail-mounted three-dimensional accelerometer can be used for monitoring foals' lying behavior. In addition, information regarding standing and walking can be gained with this method. © 2020 The Authors. Published by Elsevier Inc.
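
    The four validation measures quoted in this abstract are all derived from a two-by-two confusion table. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard one-vs-rest metrics from confusion-table counts.

    tp/fn: epochs of the target behavior that were detected/missed.
    fp/tn: epochs of other behaviors that were flagged/correctly rejected.
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of true target epochs found
        "specificity": tn / (tn + fp),   # share of other epochs rejected
        "precision": tp / (tp + fp),     # share of flagged epochs that are correct
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for a rare behavior (10 target epochs out of 100).
rare = classification_metrics(tp=8, fp=20, tn=70, fn=2)
for name, value in rare.items():
    print(f"{name}: {value:.2f}")
```

    With these made-up counts, accuracy stays fairly high while precision is low, because false positives outnumber the few true positives. A similar mechanism could explain how walking and standing were classified with good overall accuracy but poor precision, although the abstract does not report the underlying counts.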